Overview

Dataset Statistics

Number of Variables 10
Number of Rows 18879
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 0
Duplicate Rows (%) 0.0%
Total Size in Memory 5.5 MB
Average Row Size in Memory 303.8 B
Variable Types
  • Numerical: 5
  • Categorical: 4
  • DateTime: 1
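
The two memory figures above can be cross-checked against each other. The following is a minimal sketch, assuming that "MB" in the report means 2^20 bytes (the report does not state its unit convention, so this is an assumption):

```python
# Figures copied from the Dataset Statistics block.
ROWS = 18879
AVG_ROW_BYTES = 303.8

# Total footprint implied by the average row size.
total_bytes = ROWS * AVG_ROW_BYTES

# Assumption: the report's "MB" is a binary megabyte (1024 * 1024 bytes).
total_mib = total_bytes / (1024 * 1024)

print(f"{total_bytes:,.1f} bytes is about {total_mib:.1f} MiB")
```

Under that assumption the product rounds to the reported 5.5 MB; with decimal megabytes (10^6 bytes) it would round to 5.7 instead.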

Dataset Insights

id is uniformly distributed
model_year is skewed
extra_features_count is skewed
extra_features_count has 7091 (37.56%) zeros

Variables


id

numerical

Approximate Distinct Count 18879
Approximate Unique (%) 100.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 302064
Mean 10042.9145
Minimum 0
Maximum 20097
Zeros 1
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • id is uniformly distributed
  • id is effectively symmetric (γ1 = -5.2887e-05)

Quantile Statistics

Minimum 0
5-th Percentile 1002.9
Q1 5031.5
Median 10048
Q3 15051.5
95-th Percentile 19086.1
Maximum 20097
Range 20097
IQR 10020

Descriptive Statistics

Mean 10042.9145
Standard Deviation 5799.3099
Variance 3.3632e+07
Sum 1.896e+08
Skewness -5.2887e-05
Kurtosis -1.1971
Coefficient of Variation 0.5775
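
Several of the descriptive statistics for id follow from the mean, the standard deviation, and the row count. The sketch below recomputes them from the reported values and compares the spread with that of an ideal continuous uniform distribution over the same range (standard deviation of range / sqrt(12)). The constant names are illustrative and not part of the report.

```python
import math

# Values copied from the id block of the report.
N_ROWS = 18879
MEAN = 10042.9145
STD = 5799.3099
VALUE_RANGE = 20097  # Maximum - Minimum

variance = STD ** 2                 # Variance is the squared standard deviation
coeff_of_variation = STD / MEAN     # Coefficient of Variation = std / mean
total = MEAN * N_ROWS               # Sum = mean * number of rows

# Reference: standard deviation of a continuous uniform distribution
# spanning the same range.
uniform_std = VALUE_RANGE / math.sqrt(12)

print(f"variance = {variance:.4e}")
print(f"cv       = {coeff_of_variation:.4f}")
print(f"sum      = {total:.4e}")
print(f"uniform reference std = {uniform_std:.1f}")
```

The reported kurtosis of -1.1971 is likewise close to -1.2, the excess kurtosis of a continuous uniform distribution, which is consistent with the "uniformly distributed" insight.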

make

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1346025
  • The largest value (Nissan) is over 39.34 times larger than the second largest value (Nissan Motor Egypt)

Length

Mean 6.2975
Standard Deviation 1.8658
Median 6
Minimum 6
Maximum 18

Sample

1st row Nissan
2nd row Nissan
3rd row Nissan
4th row Nissan
5th row Nissan

Letter

Count 117954
Lowercase Letter 98139
Space Separator 936
Uppercase Letter 19815
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Nissan, Nissan Motor Egypt) take over 50.0%
  • The largest value (nissan) is over 40.34 times larger than the second largest value (egypt)

model

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1328191
  • The largest value (Sunny) is over 4.67 times larger than the second largest value (Qashqai)

Length

Mean 5.3528
Standard Deviation 0.7854
Median 5
Minimum 4
Maximum 14

Sample

1st row Juke
2nd row Juke
3rd row Juke
4th row Juke
5th row Juke

Letter

Count 101054
Lowercase Letter 82173
Space Separator 2
Uppercase Letter 18881
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Sunny, Qashqai) take over 50.0%
  • The largest value (sunny) is over 4.67 times larger than the second largest value (qashqai)

model_year

numerical

Approximate Distinct Count 43
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 302064
Mean 2016.192
Minimum 1918
Maximum 2024
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • model_year is skewed left (γ1 = -1.976)

Quantile Statistics

Minimum 1918
5-th Percentile 2008
Q1 2014
Median 2017
Q3 2020
95-th Percentile 2022
Maximum 2024
Range 106
IQR 6

Descriptive Statistics

Mean 2016.192
Standard Deviation 4.8076
Variance 23.1134
Sum 3.8064e+07
Skewness -1.976
Kurtosis 13.5976
Coefficient of Variation 0.002385
  • model_year is not normally distributed (p-value 1.2771e-09)
  • model_year has 475 outliers
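
The report does not say which rule produced the outlier count. A common choice is Tukey's 1.5 × IQR rule; the sketch below applies it to the model_year quartiles as an assumption, not as the report's documented method. The function name tukey_fences is hypothetical.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5) -> tuple[float, float]:
    """Return the (lower, upper) fences of Tukey's rule for given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles copied from the model_year block (Q1 = 2014, Q3 = 2020).
lower, upper = tukey_fences(2014, 2020)
print(lower, upper)
```

Under this rule the upper fence lies above the maximum of 2024, so only unusually old model years would be flagged, which matches the left skew.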

kilometers

numerical

Approximate Distinct Count 582
Approximate Unique (%) 3.1%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 302064
Mean 94909.778
Minimum 0
Maximum 285000
Zeros 239
Zeros (%) 1.3%
Negatives 0
Negatives (%) 0.0%
  • kilometers is skewed right (γ1 = 0.2924)

Quantile Statistics

Minimum 0
5-th Percentile 9999
Q1 42000
Median 90000
Q3 139999
95-th Percentile 200000
Maximum 285000
Range 285000
IQR 97999

Descriptive Statistics

Mean 94909.778
Standard Deviation 60216.6318
Variance 3.626e+09
Sum 1.7918e+09
Skewness 0.2924
Kurtosis -0.7566
Coefficient of Variation 0.6345

transmission_type

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1393494
  • The largest value (Automatic) is over 14.95 times larger than the second largest value (Manual)

Length

Mean 8.8119
Standard Deviation 0.7274
Median 9
Minimum 6
Maximum 9

Sample

1st row Automatic
2nd row Automatic
3rd row Automatic
4th row Automatic
5th row Automatic

Letter

Count 166359
Lowercase Letter 147480
Space Separator 0
Uppercase Letter 18879
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories (Automatic, Manual) take over 50.0%
  • The largest value (automatic) is over 14.95 times larger than the second largest value (manual)

price

numerical

Approximate Distinct Count 820
Approximate Unique (%) 4.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 302064
Mean 272939.6684
Minimum 10000
Maximum 1384000
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • price is skewed right (γ1 = 1.5607)

Quantile Statistics

Minimum 10000
5-th Percentile 125000
Q1 180000
Median 247000
Q3 336500
95-th Percentile 510100
Maximum 1384000
Range 1374000
IQR 156500

Descriptive Statistics

Mean 272939.6684
Standard Deviation 129286.6323
Variance 1.6715e+10
Sum 5.1528e+09
Skewness 1.5607
Kurtosis 4.3838
Coefficient of Variation 0.4737
  • price is not normally distributed (p-value 5.5337e-05)
  • price has 618 outliers

priced_at

datetime

Approximate Distinct Count 226.3906
Approximate Unique (%) 1.2%
Missing 0
Missing (%) 0.0%
Memory Size 151160
Minimum 2022-02-02 00:00:00
Maximum 2023-04-30 00:00:00
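
To put the distinct count in context, it helps to know how many calendar days the priced_at range spans. The following sketch uses only the minimum and maximum shown above; the coverage figure treats the distinct count as an estimate of distinct dates, which is an assumption.

```python
from datetime import date

# Range copied from the priced_at block.
first = date(2022, 2, 2)
last = date(2023, 4, 30)

span_days = (last - first).days   # difference between the two dates
calendar_days = span_days + 1     # number of dates in the range, inclusive

# Assumption: the distinct count estimates the number of distinct dates.
approx_distinct = 226.3906
coverage = approx_distinct / calendar_days

print(span_days, calendar_days, round(coverage, 2))
```

Roughly half of the calendar days in the range would then carry at least one price observation.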

mileage_category

categorical

Approximate Distinct Count 5
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 1366045

Length

Mean 7.3579
Standard Deviation 1.7296
Median 8
Minimum 5
Maximum 9

Sample

1st row 200k+
2nd row 200k+
3rd row 0-50k
4th row 100k-150k
5th row 0-50k

Letter

Count 31393
Lowercase Letter 31393
Space Separator 0
Uppercase Letter 0
Dash Punctuation 17843
Decimal Number 88638
  • The top 2 categories (50k-100k, 0-50k) take over 50.0%
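
The character-class counts above are enough to estimate how the rows are split across the mileage labels. This sketch rests on several assumptions: the five labels are "0-50k", "50k-100k", "100k-150k", "150k-200k", and "200k+" (the fourth does not appear in the report and is inferred from the pattern), every "-" is counted as dash punctuation, and "+" falls into none of the listed classes.

```python
# Values copied from the mileage_category block.
N_ROWS = 18879
LOWERCASE = 31393   # every lowercase letter in these labels is a "k"
DASHES = 17843
DIGITS = 88638

# "200k+" is the only label without a dash.
open_ended = N_ROWS - DASHES                  # rows labelled "200k+"

# Labels contain one "k" ("0-50k", "200k+") or two (the other three).
two_k = LOWERCASE - N_ROWS                    # rows with two "k"s
one_k = N_ROWS - two_k                        # rows with one "k"
under_50k = one_k - open_ended                # rows labelled "0-50k"

# One-"k" labels have 3 digits, "50k-100k" has 5, and the two
# ranges starting at 100k and (assumed) 150k have 6 each.
six_digit = DIGITS - 3 * one_k - 5 * two_k    # "100k-150k" plus "150k-200k"
five_digit = two_k - six_digit                # rows labelled "50k-100k"

# Consistency check against the reported mean label length.
mean_length = (5 * one_k + 8 * five_digit + 9 * six_digit) / N_ROWS

print(under_50k, five_digit, six_digit, open_ended)
print(round(mean_length, 4))
```

The reconstructed split reproduces the reported mean length and puts "50k-100k" and "0-50k" together above half of the rows, in line with the insight above. The two six-digit labels cannot be separated from these counts alone.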

extra_features_count

numerical

Approximate Distinct Count 41
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 302064
Mean 8.765
Minimum 0
Maximum 40
Zeros 7091
Zeros (%) 37.6%
Negatives 0
Negatives (%) 0.0%
  • extra_features_count is skewed right (γ1 = 0.7359)

Quantile Statistics

Minimum 0
5-th Percentile 0
Q1 0
Median 7
Q3 15
95-th Percentile 26
Maximum 40
Range 40
IQR 15

Descriptive Statistics

Mean 8.765
Standard Deviation 9.0111
Variance 81.2005
Sum 165475
Skewness 0.7359
Kurtosis -0.4025
Coefficient of Variation 1.0281
  • extra_features_count is not normally distributed (p-value 1.1739e-23)
  • extra_features_count has 33 outliers
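
Because more than a third of the rows have no extra features, the overall mean understates the typical count among rows that have any. The sketch below separates the two using only the reported row count, zero count, and sum.

```python
# Values copied from the report.
N_ROWS = 18879
ZEROS = 7091
TOTAL = 165475   # "Sum" in the Descriptive Statistics block

zero_share = ZEROS / N_ROWS          # fraction of rows with no extra features
nonzero_rows = N_ROWS - ZEROS        # rows with at least one extra feature

mean_overall = TOTAL / N_ROWS        # matches the reported mean
mean_nonzero = TOTAL / nonzero_rows  # mean over rows with at least one feature

print(f"{zero_share:.2%} zeros, {nonzero_rows} non-zero rows")
print(f"overall mean {mean_overall:.3f}, non-zero mean {mean_nonzero:.2f}")
```

Zeros contribute nothing to the sum, so dividing by the non-zero row count gives the conditional mean directly.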

Interactions

Correlations

Missing Values